
Conversation

@theunb theunb commented Sep 29, 2025

Testing done

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

@theunb theunb requested a review from a team as a code owner September 29, 2025 17:17
@theunb
Author

theunb commented Sep 29, 2025

The logs of builds are not closed and stay in memory. As a consequence, the org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream$Flush timers keep rescheduling themselves for completed jobs.

Member

@jglick jglick left a comment

This should be called by WorkflowRun.finish. Do you have some way to reproduce the issue you observed?

@jglick jglick requested a review from dwnusbaum September 29, 2025 17:33
@theunb
Author

theunb commented Sep 29, 2025

I hit this locally all the time. We use Pipeline scripts (workflow-cps-plugin) and see a steady increase of 'org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream$Flush' timers, each linked to the build log (console) of a build that has run since startup. We run the latest Jenkins and the latest plugins. Eventually the controller goes OOM and Jenkins restarts.
I can provide job configs and logs if needed.

@jglick
Member

jglick commented Sep 29, 2025

I can only speculate that something in your environment is causing https://github.com/jenkinsci/workflow-job-plugin/blob/c8d7f497b22f41c3acf8d0ea4b7b2c887be959d6/src/main/java/org/jenkinsci/plugins/workflow/job/WorkflowRun.java#L672 not to be run, or that this patch would not solve the issue either.

@dwnusbaum
Member

@theunb If you can reproduce the issue, we'd be interested to understand what is happening, but on its own the PR doesn't make sense. onProgramEnd adds a FlowEndNode to the program, which (asynchronously) triggers WorkflowRun$GraphL, which calls WorkflowRun.finish, which closes the listener as Jesse linked to earlier.
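For reference, the shape of that mechanism looks roughly like the following simplified listener (an illustration only, not the actual WorkflowRun$GraphL code; the class name is made up):
---- Illustration: a GraphListener reacting to the FlowEndNode ----
// Simplified sketch only; for real builds WorkflowRun$GraphL plays this role and
// ends up calling WorkflowRun.finish, which closes the build's log listener.
import org.jenkinsci.plugins.workflow.flow.GraphListener
import org.jenkinsci.plugins.workflow.graph.FlowEndNode
import org.jenkinsci.plugins.workflow.graph.FlowNode

class LogClosingListener implements GraphListener {
    @Override
    void onNewHead(FlowNode node) {
        if (node instanceof FlowEndNode) {
            // The program has ended; this is the point at which the run is
            // finished and its log streams are closed, so no further
            // DelayBufferedOutputStream$Flush rescheduling should happen.
            println "flow ended at node ${node.id}"
        }
    }
}
---- end ----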

It sounds like you might already have heap dumps, and if so, it would be interesting to know the concrete types of WorkflowRun.listener and any delegate streams for builds experiencing the problem. If you have configuration and logs (e.g. a build and controller support bundle), that might help us figure out what is happening. I would recommend opening a Jira ticket for investigation purposes until we understand what is happening.
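If it helps, something along these lines in the script console could report the concrete listener type for a suspect build (a sketch only; the job name and build number are placeholders, and the field name is taken from WorkflowRun.listener as mentioned above):
---- Sketch: inspect WorkflowRun.listener for one build ----
// Hypothetical script console sketch; 'some-folder/some-job' and 42 are placeholders.
import jenkins.model.Jenkins
import org.jenkinsci.plugins.workflow.job.WorkflowJob
import org.jenkinsci.plugins.workflow.job.WorkflowRun

def job = Jenkins.get().getItemByFullName('some-folder/some-job', WorkflowJob)
def run = job.getBuildByNumber(42)

def f = WorkflowRun.getDeclaredField('listener') // field name per the discussion above
f.setAccessible(true)
def listener = f.get(run)
// For a completed build this would be expected to be null or closed;
// a live listener here would be worth attaching to the Jira ticket.
println "listener: ${listener?.getClass()?.name}"
println "logger:   ${listener?.logger?.getClass()?.name}"
---- end ----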

@theunb
Author

theunb commented Oct 6, 2025

@dwnusbaum, we notice an ever-increasing number of timers. Over two weeks we accumulated 12,643,769,630 (roughly twelve billion) executed timer tasks, which averages out to about 10,000 executions per second, while running about 5000 jobs per weekday with at most 600 in parallel.

I have a script to give an overview of these timers in the queue:
---- Script to get the queue ----
// Unwrap the private 'base' field of the InterceptingExecutorService wrapper
// to reach the underlying ScheduledThreadPoolExecutor and its task queue.
def instance = jenkins.util.Timer.get() // assumption: the Flush tasks run on the shared jenkins.util.Timer
def parentClass = jenkins.util.InterceptingExecutorService
def field = parentClass.getDeclaredField("base")
field.setAccessible(true)
def baseExecutor = field.get(instance)
queue = baseExecutor.getQueue()
---- end ----

The rest of the script is long, but it basically uses reflection to reach each $Flush task, counts them, prints the referenced OutputStream and file reference, and keeps a histogram of the time remaining until each task executes.

The script reports timer tasks that have already completed (i.e. 'callable' and 'runner' are 'null' and 'queue.contains(timertask)' returns 'false'), instances of 'org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream$Flush', and the sum of both.
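A rough sketch of that counting step, continuing from the 'queue' obtained above (the generic field walk is an assumption for illustration; the real script unwraps the task wrappers explicitly):
---- Sketch: count Flush tasks in the queue ----
// Hypothetical continuation of the script above: walk each queued task's fields
// reflectively and count anything that is a DelayBufferedOutputStream$Flush.
// Inaccessible JDK-internal fields (module encapsulation) are simply skipped.
import java.lang.reflect.Modifier

def flushClassName = 'org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream$Flush'
def findFlush
findFlush = { Object o, int depth ->
    if (o == null || depth > 5) return null
    if (o.getClass().name == flushClassName) return o
    for (def c = o.getClass(); c != null && c != Object; c = c.superclass) {
        for (def f in c.declaredFields) {
            if (Modifier.isStatic(f.modifiers) || f.type.primitive) continue
            def value
            try {
                f.accessible = true
                value = f.get(o)
            } catch (Throwable ignored) {
                continue
            }
            def found = findFlush(value, depth + 1)
            if (found != null) return found
        }
    }
    return null
}

int flushCount = 0
queue.each { task -> if (findFlush(task, 0) != null) flushCount++ }
println "queued DelayBufferedOutputStream\$Flush tasks: ${flushCount}"
---- end ----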
Here are some samples taken at 10-second intervals (the script itself also takes a couple of seconds to execute):

  • total: 165565 nullNotInQueue: 74139 outputStreams: 90999 sum: 165138
  • total: 165565 nullNotInQueue: 110482 outputStreams: 54661 sum: 165143
  • total: 165564 nullNotInQueue: 98322 outputStreams: 66820 sum: 165142
  • total: 165572 nullNotInQueue: 149777 outputStreams: 15368 sum: 165145
  • total: 165575 nullNotInQueue: 77326 outputStreams: 87821 sum: 165147
  • total: 165568 nullNotInQueue: 83434 outputStreams: 81713 sum: 165147
  • total: 165571 nullNotInQueue: 148694 outputStreams: 16455 sum: 165149
  • total: 165571 nullNotInQueue: 109282 outputStreams: 55867 sum: 165149
  • total: 165597 nullNotInQueue: 148936 outputStreams: 16241 sum: 165177

I guess the 'nullNotInQueue' number depends on how many 'org.jenkinsci.plugins.workflow.log.DelayBufferedOutputStream$Flush' timer tasks have just completed; the sum seems to remain stable.
The problem I see is that we run at most 600 jobs in parallel, yet when I sample the console logs linked to these Flush timers, they all seem to point to completed builds. The Flush timer only holds a weak reference to the OutputStream, so I suspect the OutputStream is not being closed: the Flush timer keeps rescheduling itself until the OutputStream is garbage collected.
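To illustrate the pattern being described, here is a simplified sketch (not the plugin's actual DelayBufferedOutputStream$Flush code; class name and period are made up) of a task that keeps rescheduling itself for as long as its weakly referenced stream has not been garbage collected:
---- Illustration: self-rescheduling flush task ----
// Simplified sketch of the rescheduling pattern described above.
import java.lang.ref.WeakReference
import java.util.concurrent.ScheduledExecutorService
import java.util.concurrent.TimeUnit

class PeriodicFlush implements Runnable {
    private final WeakReference<OutputStream> ref
    private final ScheduledExecutorService timer

    PeriodicFlush(OutputStream out, ScheduledExecutorService timer) {
        this.ref = new WeakReference<OutputStream>(out)
        this.timer = timer
    }

    @Override
    void run() {
        OutputStream out = ref.get()
        if (out == null) {
            return // referent was garbage collected: stop rescheduling
        }
        try {
            out.flush()
        } catch (IOException ignored) {
            // ignored for the illustration
        }
        // Still reachable (e.g. the build's log stream was never released),
        // so the task reschedules itself and the cycle continues.
        timer.schedule(this, 1, TimeUnit.SECONDS)
    }
}
---- end ----
If completed builds were still strongly referencing their log OutputStream somewhere, the corresponding Flush tasks would never see a cleared reference and would keep queueing themselves, which matches the growth shown above.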
